Nucleic acids from cassava encoding starch branching enzyme II (SBEII) and their use

ABSTRACT

This invention provides isolated cassava nucleic acids encoding starch branching enzyme II (SBEII); constructs, vectors and host cells comprising the nucleic acids; and methods of using the nucleic acids to alter gene expression in cassava to obtain starch with altered properties.

FIELD OF THE INVENTION

This invention relates to novel nucleic acid sequences, vectors and host cells comprising the nucleic acid sequence(s), to polypeptides encoded thereby, and to a method of altering a host cell by introducing the nucleic acid sequence(s) of the invention.

BACKGROUND OF THE INVENTION

Starch consists of two main polysaccharides, amylose and amylopectin. Amylose is a linear polymer containing α-1,4 linked glucose units, while amylopectin is a highly branched polymer consisting of an α-1,4 linked glucan backbone with α-1,6 linked glucan branches. In most plant storage reserves amylopectin constitutes about 75% of the starch content. Amylopectin is synthesized by the concerted action of soluble starch synthase and starch branching enzyme [α-1,4 glucan: α-1,4 glucan 6-glycosyltransferase, EC 2.4.1.18]. Starch branching enzyme (SBE) hydrolyses α-1,4 linkages and rejoins the cleaved glucan, via an α-1,6 linkage, to an acceptor chain to produce a branched structure. The physical properties of starch are strongly affected by the relative abundance of amylose and amylopectin, and SBE is therefore a crucial enzyme in determining both the quantity and quality of starches produced in plant systems.

Starches are commercially available from several plant sources including maize, potato and cassava. Each of these starches has unique physical characteristics and properties and a variety of possible industrial uses. In maize there are a number of naturally occurring mutants which have altered starch composition, such as high amylopectin types (“waxy” starches) or high amylose starches, but in potato and cassava no such mutants exist on a commercial basis as yet.

Genetic modification offers the possibility of obtaining new starches which may have novel and potentially useful characteristics. Most of the work to date has involved potato plants because they are amenable to genetic manipulation i.e. they can be transformed using Agrobacterium and regenerated easily from tissue culture. In addition many of the genes involved in starch biosynthesis have been cloned from potato and thus are available as targets for genetic manipulation, for example, by antisense inhibition of expression or sense suppression.

Cassava (Manihot esculenta L. Crantz) is an important crop in the tropics, where its starch-filled roots are used both as a food source and increasingly as a source of starch. Cassava is a high yielding perennial crop that can grow on poor soils and is also tolerant of drought. Cassava starch being a root-derived starch has properties similar but not identical to potato starch and is composed of 20-25% amylose and 75-80% amylopectin (Rickard et al., 1991. Trop. Sci. 31, 189-207). Some of the genes involved in starch biosynthesis have been cloned from cassava, including starch branching enzyme I (SBE I) (Salehuzzaman et al., 1994 Plant Science 98, 53-62), and granule bound starch synthase I (GBSS I) (Salehuzzaman et al., 1993 Plant Molecular Biology 23, 947-962) and some work has been done on their expression patterns although only in in vitro grown plants (Salehuzzaman et al., 1994 Plant Science 98, 53-62).

In most plants studied to date e.g. maize (Boyer & Preiss. 1978 Biochem. Biophys. Res. Comm. 80, 169-175), rice (Smyth. 1988 Plant Sci. 57, 1-8) and pea (Smith, Planta 175, 270-279), two forms of SBE have been identified, each encoded by a separate gene. A recent review by Burton et al., (1995 The Plant Journal 7, 3-15) has demonstrated that the two forms of SBE constitute distinct classes of the enzyme such that, in general, enzymes of the same class from different plants may exhibit greater similarity than enzymes of different classes from the same plant. In their review, Burton et al. termed the two respective enzyme families class “A” and class “B”, and the reader is referred thereto (and to the references cited therein) for a detailed discussion of the distinctions between the two classes. One general distinction of note would appear to be the presence, in class A SBE molecules, of a flexible N-terminal domain, which is not found in class B molecules. The distinctions noted by Burton et al. are relied on herein to define class A and class B SBE molecules, which terms are to be interpreted accordingly.

Many organisations have interests in obtaining modified Cassava starches by means of genetic modification. This is impossible to achieve, however, unless the plant is amenable to transformation and regeneration and the starch biosynthesis genes which are to be targeted for modification have been cloned. The production of transgenic cassava plants has only recently been demonstrated (Taylor et al., 1996 Nature Biotechnology 14, 726-730; Schöpke et al., 1996 Nature Biotechnology 14, 731-735; and Li et al., 1996 Nature Biotechnology 14, 736-740). The present invention concerns the identification, cloning and sequencing of a starch biosynthetic gene from Cassava, suitable as a target for genetic manipulation.

SUMMARY OF THE INVENTION

In a first aspect the invention provides a nucleic acid sequence encoding a polypeptide having starch branching enzyme (SBE) activity, the polypeptide comprising an effective portion of the amino acid sequences shown in FIG. 4 or FIG. 13. The nucleic acid is conveniently in substantial isolation, especially in isolation from other naturally associated nucleic acid sequences.

An “effective portion” of the amino acid sequences may be defined as a portion which retains sufficient SBE activity when expressed in E. coli KV832 to complement the branching enzyme mutation therein. The amino acid sequences shown in FIGS. 4 and 13 include the N terminal transit peptide, which comprises about the first 50 amino acid residues. As those skilled in the art will be well aware, such a transit peptide is not essential for SBE activity. Thus the mature polypeptide, lacking a transit peptide, may be considered as one example of an effective portion of the amino acid sequence shown in FIG. 4 or FIG. 13.

Other effective portions may be obtained by effecting minor deletions in the amino acid sequence, whilst substantially preserving SBE activity. Comparison with known class A SBE sequences, with the benefit of the disclosure herein, will enable those skilled in the art to identify regions of the polypeptide which are less well conserved and so amenable to minor deletion, or amino acid substitution (particularly, conservative amino acid substitution) whilst substantially preserving SBE activity. Such less well-conserved regions are generally found in the N terminal amino acid residues (up to the triple proline “elbow” at residues 138-140 in FIG. 4 and up to the proline elbow at residues 143-145 in FIG. 13) and in the last 50 residues or so of the C terminal, and in particular in the acidic tail of the C terminal.

Conveniently the nucleic acid sequence is obtainable from cassava, preferably obtained therefrom, and typically encodes a polypeptide obtainable from cassava. In a particular embodiment, the encoded polypeptide may have the amino acid sequence NSKH (SEQ ID NO: 32) at about position 697 (in relation to FIG. 4 (SEQ. ID. NO. 29)), which sequence appears peculiar to an isoform of the SBE class A enzyme of cassava, other class A SBE enzymes having the conserved sequence DA D/E Y (SEQ ID NO: 33) (Burton et al., 1995 cited above).

In a particular aspect of the invention there is provided a nucleic acid comprising a portion of nucleotides 21 to 2531 of the nucleic acid sequence shown in FIG. 4, or a functionally equivalent nucleic acid sequence. Such functionally equivalent nucleic acid sequences include, but are not limited to, those sequences which encode substantially the same amino acid sequence but which differ in nucleotide sequence from that shown in FIG. 4 by virtue of the degeneracy of the genetic code. For example, a nucleic acid sequence may be altered (e.g. “codon optimised”) for expression in a host other than cassava, such that the nucleotide sequence differs substantially whilst the amino acid sequence of the encoded polypeptide is unchanged. Other functionally equivalent nucleic acid sequences are those which will hybridise under stringent hybridisation conditions (e.g. as described by Sambrook et al., Molecular Cloning. A Laboratory Manual, CSH, i.e. washing with 0.1×SSC, 0.5% SDS at 68° C.) with the sequence shown in FIG. 4. FIG. 10 shows a functionally equivalent sequence designated “125+94”, which includes a region corresponding to the 3′ coding portion of the sequence in FIG. 4. FIG. 13 shows a functionally equivalent sequence which comprises a second complete SBE coding sequence (the SBE-derived sequence is from nucleotides 35 to 2760, of which the coding sequence is nucleotides 131-2677, the rest of the sequence in the figure is vector-derived).
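By way of illustration only, the effect of the degeneracy of the genetic code may be sketched as follows. The two 12-nucleotide sequences below are hypothetical (they are not taken from FIG. 4); they were chosen simply because both encode the tetrapeptide NSKH mentioned above. The helper names `translate` and `count_differences` are likewise assumptions made for this sketch.

```python
from itertools import product

# Standard genetic code, laid out in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA coding sequence (reading frame 1) into one-letter code."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore any trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def count_differences(seq_a, seq_b):
    """Count positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(seq_a, seq_b))


# Hypothetical example sequences, not taken from the figures.
variant_1 = "AATTCTAAACAT"
variant_2 = "AACAGCAAGCAC"

print(translate(variant_1), translate(variant_2))
print(count_differences(variant_1, variant_2), "of", len(variant_1), "nucleotides differ")
```

In this toy case half of the nucleotide positions differ while the encoded peptide is unchanged, which is the sense in which a codon-optimised sequence may remain functionally equivalent.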

Functionally equivalent DNA sequences will preferably comprise at least 200-300 bp, more preferably 300-600 bp, and will exhibit at least 88% identity (more preferably at least 90%, and most preferably at least 95% identity) with the corresponding region of the DNA sequence shown in FIGS. 4 or 10. Those skilled in the art will readily be able to conduct a sequence alignment between the putative functionally equivalent sequence and those detailed in FIGS. 4 or 10—the identity of the two sequences is to be compared in those regions which are aligned by standard computer software, which aligns corresponding regions of the sequences.
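A minimal sketch of such an identity comparison is given below, assuming the two sequences have already been aligned by separate software and are supplied as equal-length strings with "-" marking gaps. Counting only those columns in which both sequences have a residue is one possible convention and is an assumption of this sketch; alignment programs may report identity differently. The function names and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Return (compared_columns, percent_identity) for two pre-aligned sequences.

    Columns in which either sequence carries a gap are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned columns to compare")
    return compared, 100.0 * matches / compared


def meets_criteria(aligned_a, aligned_b, min_length=200, min_identity=88.0):
    """Check the length and identity thresholds stated in the text."""
    compared, identity = percent_identity(aligned_a, aligned_b)
    return compared >= min_length and identity >= min_identity


# Toy data: a 200-base reference and a copy carrying 20 substitutions.
reference = "ACGT" * 50
candidate = list(reference)
for i in range(0, 200, 10):
    candidate[i] = "T"  # positions 0, 10, 20, ... hold A or G in the reference
candidate = "".join(candidate)

print(percent_identity(reference, candidate))
print(meets_criteria(reference, candidate))
```

The default thresholds mirror the 200 bp and 88% figures given above; the stricter 90% and 95% preferences can be tested by passing a different `min_identity`.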

In particular embodiments the nucleic acid sequence may alternatively comprise a 5′ and/or a 3′ untranslated region (“UTR”), examples of which are shown in FIGS. 2 and 4. FIG. 9 includes a 3′ UTR as nucleotides 688-1044, and FIG. 10 includes a 3′ UTR as nucleotides 1507-1900 (which nucleotides correspond to the first base after the “stop” codon to the base immediately preceding the poly (A) tail). Any one of the sequences defined above, or a functional equivalent thereof (as defined by hybridisation properties, as set out in the preceding paragraph), could be useful in sense or anti-sense inhibition of corresponding genes, as will be apparent to those skilled in the art. It will also be apparent to those skilled in the art that such regions may be modified so as to optimise expression in a particular type of host cell and that the 5′ and/or 3′ UTRs could be used in isolation, or in combination with a coding portion of the sequence of the invention. Similarly, a coding portion could be used without a 5′ or a 3′ UTR if desired.

In a further aspect, the invention provides a replicable nucleic acid construct comprising any one of the nucleic acid sequences defined above. The construct will typically comprise a selectable marker and may allow for expression of the nucleic acid sequence of the invention. Conveniently the vector will comprise a promoter (especially a promoter sequence operable in a plant and/or a promoter operable in a bacterial cell) and one or more regulatory signals known to those skilled in the art.

In another aspect the invention provides a polypeptide having SBE activity, the polypeptide comprising an effective portion of the amino acid sequence shown in FIG. 4 (SEQ. ID. NO. 29) or FIG. 13 (SEQ. ID. NO. 31). The polypeptide is conveniently one obtainable from cassava, although it may be derived using recombinant DNA techniques. The polypeptide is preferably in substantial isolation from other polypeptides of plant origin, and more preferably in substantial isolation from any other polypeptides. The polypeptide may have amino acid residues NSKH (SEQ ID NO: 32) at about position 697 (in the sequence shown in FIG. 4 (SEQ. ID. NO. 29)), instead of the sequence DA D/E Y (SEQ ID NO: 33) found in other SBE class A polypeptides. The polypeptide may be used in a method of modifying starch in vitro, the method comprising treating starch under suitable conditions (of temperature, pH etc.) with an effective amount of the polypeptide.

Those skilled in the art will appreciate that the disclosure of the present specification can be utilised in a number of ways. In particular, the characteristics of a host cell may be altered by recombinant DNA techniques. Thus, in a further aspect, there is provided a method by which a host cell may be altered by introduction of a nucleic acid sequence comprising at least 200 bp and exhibiting at least 88% sequence identity (more preferably at least 90%, and most preferably at least 95% identity) with the corresponding region of the DNA sequence shown in FIGS. 4, 9, 10 or 13, operably linked in the sense or (preferably) in the anti-sense orientation to a suitable promoter active in the host cell, and causing transcription of the introduced nucleic acid sequence, said transcript and/or the translation product thereof being sufficient to interfere with the expression of a homologous gene naturally present in said host cell, which homologous gene encodes a polypeptide having SBE activity. The altered host cell is typically a plant cell, such as a cell of a cassava, banana, potato, sweet potato, tomato, pea, wheat, barley, oat, maize, or rice plant.

Desirably the method further comprises the introduction of one or more nucleic acid sequences which are effective in interfering with the expression of other homologous gene or genes naturally present in the host cell. Such other genes whose expression is inhibited may be involved in starch biosynthesis, (e.g. an SBE I gene), or may be unrelated to SBE II.

Those skilled in the art will be aware that both anti-sense inhibition and “sense suppression” of expression of genes, especially plant genes, have been demonstrated (e.g. Matzke & Matzke 1995 Plant Physiol. 107, 679-685).

It is believed that antisense methods are mainly operable by the production of antisense mRNA which hybridises to the sense mRNA, preventing its translation into functional polypeptide, possibly by causing the hybrid RNA to be degraded (e.g. Sheehy et al., 1988 PNAS 85, 8805-8809; Van der Krol et al. Mol. Gen. Genet. 220, 204-212). Sense suppression also requires homology between the introduced sequence and the target gene, but the exact mechanism is unclear. It is apparent however that, in relation to both antisense and sense suppression, neither a full length nucleotide sequence, nor a “native” sequence is essential. Preferably the nucleic acid sequence used in the method will comprise at least 200-300 bp, more preferably at least 300-600 bp, of the full length sequence, but by simple trial and error other fragments (smaller or larger) may be found which are functional in altering the characteristics of the plant. It is also known that untranslated portions of sequence can suffice to inhibit expression of the homologous gene—coding portions may be present within the introduced sequence, but they do not appear to be essential under all circumstances.
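Purely as an illustrative sketch, the selection of a fragment for use in the anti-sense orientation may be represented as taking a window of at least 200 bp from the cDNA and reverse-complementing it, so that transcription from the promoter yields RNA complementary to the sense mRNA. The function names, the default minimum length parameter and the placeholder sequence below are assumptions made for the example; the placeholder is not an SBE sequence.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.translate(_COMPLEMENT)[::-1]


def antisense_insert(cdna, start, length, min_length=200):
    """Return the fragment cdna[start:start + length] in anti-sense orientation.

    min_length reflects the lower preferred fragment size given in the text.
    """
    if length < min_length:
        raise ValueError("fragment shorter than the preferred minimum length")
    fragment = cdna[start:start + length]
    if len(fragment) < length:
        raise ValueError("requested window runs past the end of the cDNA")
    return reverse_complement(fragment)


# Placeholder 300-base sequence standing in for a cDNA (not a real SBE sequence).
placeholder_cdna = "ATGGCTTCAG" * 30
insert = antisense_insert(placeholder_cdna, start=50, length=200)
print(len(insert))
```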

The inventors have discovered that there are at least two class A SBE genes in cassava. A fragment of a second gene has been isolated, which fragment directs the expression of the C terminal 481 amino acids of cassava class A SBE (see FIG. 10) and comprises a 3′ untranslated region. Subsequently, a complete clone of the second gene was also recovered (see FIG. 12). The coding portions of the two genes show some slight differences, and the second SBE gene may be considered as functionally equivalent to the corresponding portion of the nucleotide sequence shown in FIG. 4. However, the 3′ untranslated regions of the two genes show marked differences. Thus the method of altering a host cell may comprise the use of a sufficient portion of either gene so as to inhibit the expression of the naturally occurring homologous gene. Conveniently, a portion of nucleotide sequence is employed which is conserved between both genes. Alternatively, sufficient portions of both genes may be employed, typically using a single construct to direct the transcription of both introduced sequences.

In addition, as explained above, it may be desired to cause inhibition of expression of the class B SBE (i.e. SBE I) in the same host cell. A number of class B SBE gene sequences are known, including portions of the cassava class B SBE (Salehuzzaman et al., 1994 Plant Science 98, 53-62) and any one of these may prove suitable. Preferably the sequence used is that which derives from the host cell sought to be altered (e.g. when altering the characteristics of a cassava plant cell, it is generally preferred to use sense or anti-sense sequences corresponding exactly to at least portions of the cassava gene whose expression is sought to be inhibited).

In a further aspect the invention provides an altered host cell, into which has been introduced a nucleic acid sequence comprising at least 200 bp and exhibiting at least 88% sequence identity (more preferably at least 90%, and most preferably at least 95% identity) with the corresponding region of the DNA sequence shown in FIGS. 4, 9, 10 or 13, operably linked in the sense or anti-sense orientation to a suitable promoter, said host cell comprising a natural gene sharing sequence homology with the introduced sequence.

The host cell may be a micro-organism (such as a bacterial, fungal or yeast cell) or a plant cell. Conveniently the host cell altered by the method is a cell of a cassava plant, or another plant with starch storage reserves, such as banana, potato, sweet potato, tomato, pea, wheat, barley, oat, maize, or rice plant. Typically the sequence will be introduced in a nucleic acid construct, by way of transformation, transduction, micro-injection or other method known to those skilled in the art. The invention also provides for a plant into which has been introduced a nucleic acid sequence of the invention, or the progeny of such a plant.

The altered plant cell will preferably be grown into an altered plant, using techniques of plant growth and cultivation well-known to those skilled in the art of re-generating plantlets from plant cells.

The invention also provides a method of obtaining starch from an altered plant, the plant being obtained by the method defined above. Starch may be extracted from the plant by any of the known techniques (e.g. milling). The invention further provides starch obtainable from a plant altered by the method defined above, the starch having altered properties compared to starch extracted from an equivalent but unaltered plant. Conveniently the altered starch is obtained from an altered plant selected from the group consisting of cassava, potato, pea, tomato, maize, wheat, barley, oat, sweet potato and rice. Typically the altered starch will have increased amylose content.

The invention will now be further described by way of illustrative examples and with reference to the accompanying drawings, in which:

FIG. 1 is a schematic illustration of the cloning strategy for cassava SBE II. The top line represents the size of a full length clone with distances in kilobases (kb) and arrows representing oligonucleotides (rightward pointing arrows are sense strand, leftward are on opposite strand). The long thick arrow is the open reading frame with start and stop codons shown. Below this are shown the 3′ RACE, 5′ RACE and PCR clones identified either by the plasmid name (shown in brackets above the line) or the clone number (shown to the left of the clone) for the 5′ RACE only. Also shown (by an x) in the 5′ RACE clones are positions of small deletions or introns.

FIG. 2 shows the DNA sequence and predicted ORF of csbe2con.seq (SEQ ID NO: 39). This sequence is a consensus of 3′ RACE pSJ94 and 5′ RACE clones 27/9,11 and 28. The first 64 base pairs are derived from the RoRidT17 adaptor primer/dT tail followed by the SBE sequence. The one long open reading frame is shown in one letter code below the double strand DNA sequence. Also shown is the upstream ORF (MQL . . . LPW). (SEQ ID NO: 40)

FIG. 3 shows an alignment of the 5′ region of cassava SBE II csbe2con (SEQ ID NO: 41) and pSJ99 (clones 20 (SEQ ID NO: 42) and 35 (SEQ ID NO: 43)) DNA sequences. Differences from the consensus sequence are shaded.

FIG. 4 shows the DNA sequence (SEQ ID NO: 28) and predicted ORF (SEQ ID NO: 29) of full length cassava SBE II tuber cDNA in pSJ107. The sequence shown is from the CSBE214 (SEQ. ID. NO. 15) to the CSBE218 (SEQ. ID. NO. 19) oligonucleotide. The DNA sequence is sequence ID No. 28 in the attached sequence listing; the amino acid sequence is Seq ID No. 29.

FIG. 5 shows an alignment of 3′ region of cassava SBE II pSJ116 (SEQ ID NO: 44) and 125+94 DNA sequences (SEQ ID NO: 45). The top line is the 125+94 sequence and the bottom SJ116 sequence.

FIG. 6 shows an alignment of carboxy terminal region of pSJ116 (SEQ ID NO: 46) and 125+94 protein sequences (SEQ ID NO: 47). The top sequence is from 125+94 and the bottom from pSJ116. Identical amino acid residues are shown with the same letter, conserved changes with a colon and neutral changes with a period.

FIG. 7 shows a phylogenetic tree of starch branching enzyme proteins. The length of each pair of branches represents the distance between sequence pairs. The scale beneath the tree measures the distance between sequences (units indicate the number of substitution events). Dotted lines indicate a negative branch length because of averaging the tree. Zmcon12.pro is maize SBE II, psstb1.pro is pea SBE I (Bhattacharyya et al 1990 Cell 60, 115-121) and atsbe2-1 & 2-2.pro are two SBE II proteins from Arabidopsis thaliana (Fisher et al 1996 Plant Mol. Biol. 30, 97-108). SJ107.pro is representative of a cassava SBE II sequence, and potsbe2.pro is a potato SBE II sequence known to the inventors.

FIG. 8 is an alignment of SBE II proteins. Protein sequences are indicated in one letter code. The top line represents the consensus sequence, below which is shown the consensus ruler and the individual SBE II sequences (SEQ ID Nos: 54-59). Residues matching the consensus are shaded. Dashes represent gaps introduced to optimise alignment. Sequence identities are shown at the right of the figure and are as FIG. 7, except that SJ107.pro is cassava SBE II (SEQ ID NO: 29).

FIG. 9 shows the DNA sequence (SEQ ID NO: 48) and predicted ORF (SEQ ID NO: 49) of a cassava SBE II cDNA isolated by 3′ RACE (plasmid pSJ 101).

FIG. 10 shows the consensus DNA sequence (SEQ ID NO: 50) and predicted ORF (SEQ ID NO: 51) of a second cassava SBE II cDNA isolated by 3′ and 5′ RACE (sequence designated 125+94 is from plasmid pSJ125 and pSJ94, spliced at the CSBE217, SEQ. ID. NO. 18, oligo sequence).

FIG. 11 is a schematic diagram of the plant transformation vector pSJ64. The black line represents the DNA sequence. The hashed line represents the bacterial plasmid backbone (containing the origin of replication and bacterial selection marker) and is not shown in full. The filled triangles represent the T-DNA borders (RB=right border, LB=left border). Relevant restriction enzyme sites are shown above the black line with the approximate distances (in kilobases) between sites marked by an asterisk shown underneath. The thinnest arrows represent polyadenylation signals (pAnos=nopaline synthase, pAg7=Agrobacterium gene 7), the intermediate arrows represent protein coding regions (SBE II=cassava SBE II, HYG=hygromycin resistance gene) and the thick arrows represent promoter regions (P-2×35S=double CaMV 35S promoter, P-nos=nopaline synthase promoter).

FIG. 12 is a schematic illustration of the cloning strategy used to isolate a second cassava SBE II gene. The top line represents the size of a full length clone with distances in kilobases (kb) and arrows representing oligonucleotides (rightward pointing arrows are sense strand, leftward are on opposite strand). The long thick arrow is the open reading frame with start and stop codons shown. Below this are shown the 3′ RACE, 5′ RACE and PCR clones identified either by the plasmid name (shown in brackets above the line) or the clone number (shown to the right of the clone).

FIG. 13 shows the DNA sequence (SEQ ID NO: 30) and predicted ORF (SEQ ID NO: 31) of a second full length cassava SBE II tuber cDNA in pSJ146. Nucleotides 35-2760 are SBE II sequence (SEQ ID NO: 52) and the remainder are from the pT7Blue vector (SEQ ID NO: 53). The DNA sequence of FIG. 13 is Seq ID No. 30, and the amino acid sequence is Seq ID No. 31, in the attached sequence listing.

EXAMPLE 1

This example relates to the isolation and cloning of SBE II sequences from cassava.

Recombinant DNA Manipulations

Standard procedures were performed essentially according to Sambrook et al. (1989 Molecular cloning A laboratory manual, 2nd edn. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.). DNA sequencing was performed on an ABI automated DNA sequencer and sequences manipulated using DNASTAR software for the Macintosh.

Rapid Amplification of cDNA Ends (RACE) and PCR Conditions

5′ and 3′ RACE were performed essentially according to Frohman et al., (1988 Proc. Natl. Acad. Sci. USA 85, 8998-9002) but with the following modifications.

For 3′ RACE, 5 μg of total RNA was reverse transcribed using 5 pmol of the RACE adaptor RoRidT17 as primer and Stratascript RNAse H-reverse transcriptase (50 U) in a 50 μl reaction according to the manufacturer's instructions (Stratagene). The reaction was incubated for 1 hour at 37° C. and then diluted to 200 μl with TE (10 mM Tris HCl, 1 mM EDTA) pH 8 and stored at 4° C. 2.5 μl of this cDNA was used in a 25 μl PCR reaction with 12.5 pmol of SBE A and Ro primers for 30 cycles of 94° C. 45 sec, 50° C. 25 sec, 72° C. 1 min 30 sec. A second round of PCR (25 cycles) was performed using 1 μl of this reaction as template in a 50 μl reaction under the same conditions. Amplified products were separated by agarose gel electrophoresis and cloned into the pT7Blue vector (Invitrogen).
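The quantities implied by the 3′ RACE conditions above can be worked through as a short calculation. The sketch below assumes, purely for bookkeeping, that the diluted cDNA is treated as "RNA equivalents" (i.e. the amount of input RNA represented by a given volume, not the actual cDNA yield) and that the cycling time counts only the stated hold times, ignoring ramping between temperatures. The variable names are illustrative.

```python
# Values stated in the 3' RACE protocol above.
RNA_INPUT_UG = 5.0           # total RNA reverse transcribed
DILUTED_CDNA_UL = 200.0      # final volume after dilution with TE
TEMPLATE_UL = 2.5            # diluted cDNA added to the first-round PCR
PCR_VOLUME_UL = 25.0         # first-round PCR volume
PRIMER_PMOL = 12.5           # amount of each primer (SBE A and Ro)
CYCLES = 30
STEP_SECONDS = (45, 25, 90)  # 94 C, 50 C and 72 C holds per cycle

# Input RNA represented by the template aliquot (assumes even dilution).
rna_equivalent_ng = RNA_INPUT_UG * 1000 * TEMPLATE_UL / DILUTED_CDNA_UL

# pmol per microlitre is numerically equal to micromolar.
primer_concentration_um = PRIMER_PMOL / PCR_VOLUME_UL

# Total hold time for the first round, in minutes.
hold_time_min = CYCLES * sum(STEP_SECONDS) / 60

print(f"template: {rna_equivalent_ng} ng RNA equivalent")
print(f"each primer: {primer_concentration_um} uM")
print(f"cycling hold time: {hold_time_min} min")
```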

For the first round of 5′ RACE, 5 μg of total leaf RNA was reverse transcribed as described above using 10 pmol of the SBE II gene specific primer CSBE22. This primer was removed from the reaction by diluting to 500 μl with TE and centrifuging twice through a Centricon 100 microconcentrator. The concentrated cDNA was then dA-tailed with 9 U of terminal deoxynucleotide transferase and 50 μM dATP in a 20 μl reaction in buffer supplied by the manufacturer (BRL). The reaction was incubated for 10 min at 37° C. and 5 min at 65° C. and then diluted to 200 μl with TE pH 8. PCR was performed in a 50 μl volume using 5 μl of tailed cDNA, 2.5 pmol of RoRidT17 and 25 pmol of Ro and CSBE24 primers for 30 cycles of 94° C. 45 sec, 55° C. 25 sec, 72° C. 3 min. Amplified products were separated on a 1% TAE agarose gel and cut out; 200 μl of TE was added and the gel slice was melted at 99° C. for 10 min. Five μl of this was re-amplified in a 50 μl volume using CSBE25 and Ri as primers and 25 cycles of 94° C. 45 sec, 55° C. 25 sec, 72° C. 1 min 30 sec. Amplified fragments were separated on a 1% TAE agarose gel, purified on DEAE paper and cloned into pT7Blue.

The second round of 5′ RACE was performed using CSBE28 and 29 primers in the first and second round PCR reactions respectively using a new A-tailed cDNA library primed with CSBE27.

A third round of 5′ RACE was performed on the same CSBE27 primed cDNA.

Repeat 3′ RACE and PCR Cloning

The 3′ RACE library (RoRidT17 primed leaf RNA) was used as a template. The first PCR reaction was diluted 1:20 and 1 μl was used in a 50 μl PCR reaction with SBE A and Ri primers and the products were cloned into pT7Blue. The cloned PCR products were screened for the presence or absence of the CSBE23 oligo by colony PCR.

A full length cDNA of cassava SBE II was isolated by PCR from leaf or root cDNA (RoRidT17 primed) using primers CSBE214 and CSBE218 from 2.5 μl of cDNA in a 25 μl reaction and 30 cycles of 94° C. 45 sec, 55° C. 25 sec, 72° C. 2 min.

Complementation of E. coli Mutant KV832

SBE II containing plasmids were transformed into the branching enzyme deficient mutant E. coli KV832 (Keil et al., 1987 Mol. Gen. Genet. 207, 294-301) and cells grown on solid PYG media (0.85% KH₂PO₄, 1.1% K₂HPO₄, 0.6% yeast extract) containing 1.0% glucose. To test for complementation, a loop of cells was scraped off and resuspended in 150 μl water to which was added 15 μl of Lugol's solution (2 g KI and 1 g I₂ per 300 ml water).

RNA Isolation

RNA was isolated from cassava plants by the method of Logemann (1987 Anal. Biochem. 163, 21-26). Leaf RNA was isolated from 0.5 gm of in vitro grown plant tissue. The total yield was 300 μg. Three month old roots (89 gm) were used for isolation of root RNA.

SBE II Specific Oligonucleotides

SBE A ATGGACAAGGATATGTATGA (Seq ID No. 1)
CSBE21 GGTTTCATGACTTCTGAGCA (Seq ID No. 2)
CSBE22 TGCTCAGAAGTCATGAAACC (Seq ID No. 3)
CSBE23 TCCAGTCTCAATATACGTCG (Seq ID No. 4)
CSBE24 AGGAGTAGATGGTCTGTCGA (Seq ID No. 5)
CSBE25 TCATACATATCCTTGTCCAT (Seq ID No. 6)
CSBE26 GGGTGACTTCAATGATGTAC (Seq ID No. 7)
CSBE27 GGTGTACATCATTGAAGTCA (Seq ID No. 8)
CSBE28 AATTACTGGCTCCGTACTAC (Seq ID No. 9)
CSBE29 CATTCCAACGTGCGACTCAT (Seq ID No. 10)
CSBE210 TACCGGTAATCTAGGTGTTG (Seq ID No. 11)
CSBE211 GGACCTTGGTTTAGATCCAA (Seq ID No. 12)
CSBE212 ATGAGTCGCACGTTGGAATG (Seq ID No. 13)
CSBE213 CAACACCTAGATTACCGGTA (Seq ID No. 14)
CSBE214 TTAGTTGCGTCAGTTCTCAC (Seq ID No. 15)
CSBE215 AATATCTATCTCAGCCGGAG (Seq ID No. 16)
CSBE216 ATCTTAGATAGTCTGCATCA (Seq ID No. 17)
CSBE217 TGGTTGTTCCCTGGAATTAC (Seq ID No. 18)
CSBE218 TGCAAGGACCGTGACATCAA (Seq ID No. 19)

Results

Cloning of a SBE II Gene from Cassava Leaf

The strategy for cloning a full length cDNA of starch branching enzyme II of cassava is shown in FIG. 1. A comparison of several SBE II (class A) DNA sequences identified a 23 bp region which appears to be completely conserved among most genes (data not shown) and is positioned about one kilobase upstream from the 3′ end of the gene. An oligonucleotide primer (designated SBE A) was made to this sequence and used to isolate a partial cDNA clone by 3′ RACE PCR from first strand leaf cDNA as illustrated in FIG. 1. An approximately 1100 bp band was amplified, cloned into pT7Blue vector and sequenced. This clone was designated pSJ94 and contained a 1120 bp insert starting with the SBE A oligo and ending with a polyA tail. There was a predicted open reading frame of 235 amino acids which was highly homologous (79% identical) to a potato SBE II also isolated by the inventors (data not shown) suggesting that this clone represented a class A (SBE II) gene.

To obtain the sequence of a full length clone nested primers were made complementary to the 5′ end of this sequence and used in 5′ RACE PCR to isolate clones from the 5′ region of the gene. A total of three rounds of 5′ RACE was needed to determine the sequence of the complete gene (i.e. one that has a predicted long ORF preceded by stop codons). It should be noted that during this cloning process several clones (#23, 9, 16) were obtained that had small deletions and in one case (clone 23) there was also a small (120 bp) intron present. These occurrences are not uncommon and probably arise through errors in the PCR process and/or reverse transcription of incompletely processed RNA (heterogeneous nuclear RNA).
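Several of the oligonucleotides listed above appear to be exact reverse complements of one another, that is, sense and anti-sense primers directed to the same site (for example, CSBE25 against the SBE A site). The following sketch checks this directly from the listed sequences. The dictionary and function names are illustrative assumptions; the primer sequences themselves are copied from the list of SBE II specific oligonucleotides.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(_COMPLEMENT)[::-1]


# Sequences as given in the oligonucleotide list (Seq ID Nos. 1 to 19).
PRIMERS = {
    "SBE A": "ATGGACAAGGATATGTATGA",
    "CSBE21": "GGTTTCATGACTTCTGAGCA",
    "CSBE22": "TGCTCAGAAGTCATGAAACC",
    "CSBE23": "TCCAGTCTCAATATACGTCG",
    "CSBE24": "AGGAGTAGATGGTCTGTCGA",
    "CSBE25": "TCATACATATCCTTGTCCAT",
    "CSBE26": "GGGTGACTTCAATGATGTAC",
    "CSBE27": "GGTGTACATCATTGAAGTCA",
    "CSBE28": "AATTACTGGCTCCGTACTAC",
    "CSBE29": "CATTCCAACGTGCGACTCAT",
    "CSBE210": "TACCGGTAATCTAGGTGTTG",
    "CSBE211": "GGACCTTGGTTTAGATCCAA",
    "CSBE212": "ATGAGTCGCACGTTGGAATG",
    "CSBE213": "CAACACCTAGATTACCGGTA",
    "CSBE214": "TTAGTTGCGTCAGTTCTCAC",
    "CSBE215": "AATATCTATCTCAGCCGGAG",
    "CSBE216": "ATCTTAGATAGTCTGCATCA",
    "CSBE217": "TGGTTGTTCCCTGGAATTAC",
    "CSBE218": "TGCAAGGACCGTGACATCAA",
}


def complementary_pairs(primers):
    """List primer pairs (in listing order) that are exact reverse complements."""
    names = list(primers)
    pairs = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if reverse_complement(primers[first]) == primers[second]:
                pairs.append((first, second))
    return pairs


for first, second in complementary_pairs(PRIMERS):
    print(first, "<->", second)
```

Only exact, full-length reverse complements are reported; primers that overlap the same region with an offset (CSBE26 and CSBE27, for instance) are not detected by this simple comparison.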

The overlapping cDNA fragments could be assembled into a contiguous 3 kb sequence (designated csbe2con.seq) which contained one long predicted ORF as shown in FIG. 2. Several clones in the last round of 5′ RACE were obtained which included sequence of the untranslated leader (UTL). All of these clones had an ORF (42 amino acids) 46 bp upstream and out of frame with that of the long ORF.

There Is More than One SBE II Gene in Cassava

In order to determine if the assembled sequence represented that of a single gene, attempts were made to recover by PCR a full length SBE II gene using primers CSBE214 and CSBE23 at the 5′ and 3′ ends of the csbe2con sequence respectively. All attempts were unsuccessful using either leaf or root cDNA as template. The PCR was therefore repeated with either the 5′- or 3′-most primer and complementary primers along the length of the SBE II gene to determine the size of the largest fragment that could be amplified. With the CSBE214 primer, fragments could be amplified using primers 210, 28, 27 and 22 in order of increasing distance, the latter primer pair amplifying a 2.2 kb band. With the 3′ primer CSBE23, only primer pairs with 21 and 26 gave amplification products, the latter being about 1200 bp. These results suggest that the original 3′ RACE clone (pSJ94) is derived from a different SBE II gene than the rest of the 5′ RACE clones, even though the two largest PCR fragments (214+22 and 26+23) overlap by 750 bp and share several primer sites. It is likely that the sequence of the two genes starts to diverge around the CSBE22 primer site such that the 3′ end of the corresponding gene does not contain the 23 primer and is not therefore able to amplify a cDNA when used with the 214 primer.
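As a rough check on the fragment sizes quoted above, the length jointly covered by the two largest PCR products can be worked out from their sizes and overlap. The helper name `combined_span` is hypothetical, and the inputs are the approximate values given in the text (2.2 kb, about 1200 bp, 750 bp overlap).

```python
def combined_span(len_a, len_b, overlap):
    """Length (bp) covered by two fragments that overlap by `overlap` bp."""
    if overlap < 0 or overlap > min(len_a, len_b):
        raise ValueError("overlap must lie between 0 and the shorter fragment length")
    return len_a + len_b - overlap

# 214+22 fragment (about 2200 bp) and 26+23 fragment (about 1200 bp)
span = combined_span(2200, 1200, 750)
print(f"combined span: about {span} bp")
```

The result is only as precise as the gel estimates it is based on.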

To confirm this, the sequence of the longest 5′ PCR fragment (214+22) from two clones (#20 designated pSJ99, & #35) was determined and compared to the consensus sequence csbe2con as shown in FIG. 3. The first 2000 bases are nearly identical (the single base changes might well be PCR errors), however the consensus sequence is significantly different after this. This region corresponds to the original 3′ RACE fragment pSJ94 (SBE A+Ri adaptor) and provided evidence that there may be more than one SBE II gene in cassava.

The 3′ end corresponding to pSJ99 was therefore cloned as follows: 3′ RACE PCR was performed on leaf cDNA using the SBE A oligo as the gene specific primer so that all SBE II genes would be amplified. The cloned DNA fragments were then screened for the presence or absence of the CSBE23 primer by PCR. Two out of 15 clones were positive with the SBE A+Ri primer pair but negative with SBE A+CSBE23 primers. The sequence of these two clones (designated pSJ101, as shown in FIG. 9) demonstrated that they were indeed from an SBE II gene and that they were different from pSJ94. However the overlapping region of pSJ101 (the 3′ clone) and pSJ99 (the 5′ clone) was identical suggesting that they were derived from the same gene.

To confirm this a primer (CSBE218, SEQ. ID. NO. 19) was made to a region in the 3′ UTR (untranslated region) of pSJ101 and used in combination with CSBE214 (SEQ. ID. NO. 15) primer to recover by PCR a full length cDNA from both leaf and root cDNA. These clones were sequenced and designated pSJ106 & pSJ107 respectively. The sequence and predicted ORF of pSJ101 is shown in FIG. 4 (SEQ. ID. NO. 28). The long ORF in plasmid pSJ106 was found to be interrupted by a stop codon (presumably introduced in the PCR process) approximately 1 kb from the 3′ end of the gene, therefore another cDNA clone (designated pSJ116) was amplified in a separate reaction, cloned and sequenced. This clone had an intact ORF (data not shown).

There were only a few differences in these two sequences (in the transit peptide aa 27-41: YRRTSSCLSFNFKEA (SEQ ID NO: 34) to DRRTSSCLSFIFKKAA (SEQ ID NO: 35) and L831 in pSJ107 to V in pSJ116 respectively).
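The differences in the transit peptide region can be listed mechanically. The sketch below uses Python's standard `difflib.SequenceMatcher` on the two peptide segments quoted above; it is not an alignment method described in this document, and the names `PSJ107_SEGMENT`, `PSJ116_SEGMENT` and `list_differences` are hypothetical.

```python
from difflib import SequenceMatcher

PSJ107_SEGMENT = "YRRTSSCLSFNFKEA"    # SEQ ID NO: 34
PSJ116_SEGMENT = "DRRTSSCLSFIFKKAA"   # SEQ ID NO: 35

def list_differences(a, b):
    """Return (operation, residues_in_a, residues_in_b) for each non-matching stretch."""
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    return [
        (tag, a[i1:i2], b[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]

for tag, old, new in list_differences(PSJ107_SEGMENT, PSJ116_SEGMENT):
    print(tag, old or "-", "->", new or "-")
```

For these two segments the comparison reports three substitutions and one single-residue length difference; together with the L831 to V change this corresponds to four amino acid changes and one insertion or deletion between the two clones.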

An additional 740 bp of sequence of the gene corresponding to the pSJ94 clone was isolated by 5′ RACE using the primers CSBE216 and 217, and was designated pSJ125.

This sequence was combined with that of pSJ94 to form a consensus sequence “125+94”, as shown in FIG. 10. The sequence of this second gene is about 90% identical at the DNA and protein level to pSJ116, as shown in FIGS. 5 and 6, and is clearly a second form of SBE II in cassava. The 3′ untranslated regions of the two genes are not related (data not shown).

It was also determined that the full length cassava SBE II genes (from both leaf and tuber) encode active starch branching enzymes, since the cloned genes were able to complement the glycogen branching enzyme deficient E. coli mutant KV832.

Main Findings

1) A full length cDNA clone of a starch branching enzyme II (SBE II) gene has been cloned from leaves and starch storing roots of cassava. This cDNA encodes an 836 amino acid protein (Mr 95 kD) and is 86% identical to pea SBE I over the central conserved domain, although the level of sequence identity over the entire coding region is lower than 86%.

2) There is more than one SBE II gene in cassava as a second partial SBE II cDNA was isolated which differs slightly in the protein coding region from the first gene and has no homology in the 3′ untranslated region.

3) The isolated full length cDNA from both leaves and roots encodes an active SBE as it complements an E. coli mutant deficient in glycogen branching enzyme as assayed by iodine staining.

We have shown that there are SBE II (Class A) gene sequences present in the cassava genome by isolating cDNA fragments using 3′ and 5′ RACE. From these cDNA fragments a consensus sequence of over 3 kb could be compiled which contained one long open reading frame (FIG. 2) which is highly homologous to other SBE II (class A) genes (data not shown). It is likely that the consensus sequence does not represent that of a single gene, since attempts to PCR a full length gene using primers at the 5′ and 3′ ends of this sequence were not successful. In fact, screening of a number of leaf derived 3′ RACE cDNAs showed that a second SBE II gene (clone designated pSJ101) was also expressed which is highly homologous within the coding region to the originally isolated cDNA (pSJ94) but has a different 3′ UTR. A full length SBE II gene was isolated from leaves and roots by PCR using a new primer to the 3′ end of this sequence and the original sequence at the 5′ end of the consensus sequence. If the frequency of clones isolated by 3′ RACE PCR reflects the abundance of the mRNA levels then this full length gene may be expressed at lower levels in the leaf than the pSJ94 clone (2 out of 15 were the former class, 13 out of 15 the latter). It should be noted that each class is expressed in both leaves and roots as judged by PCR (data not shown). Sequence analysis of the predicted ORF of the leaf and root genes showed only a few differences (4 amino acid changes and one deletion) which could have arisen through PCR errors or, alternatively, there may be more than one nearly identical gene expressed in these tissues.
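The clone counts above (2 of 15 versus 13 of 15) are a small sample, so any abundance estimate drawn from them is loose. As an illustration only, and not an analysis performed in this work, the sketch below computes a 95% Wilson score interval for the fraction of 3′ RACE clones in the pSJ101 class; the function name `wilson_interval` is hypothetical.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 for about 95%)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half

low, high = wilson_interval(2, 15)
print(f"observed fraction {2 / 15:.2f}, interval {low:.2f} to {high:.2f}")
```

The interval is wide, which is consistent with the cautious wording used above; it also assumes that cloning and amplification are unbiased between the two classes, which PCR-based sampling does not guarantee.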

A comparison of all known SBE II protein sequences shows that the cassava SBE II gene is most closely related to the pea gene (FIG. 8). The two proteins are 86.3% identical over a 686 amino acid range which extends from the triple proline “elbow” (Burton et al., 1995 Plant J. 7, 3-15) to the conserved VVYA (SEQ ID NO: 36) sequence immediately preceding the C-terminal extensions (data not shown). All SBE II proteins are conserved over this range in that they are at least 80% similar to each other. Remarkably however, the sequence conservation between the pea, potato and cassava SBE II proteins also extends to the N-terminal transit peptide, especially the first 12 amino acids of the precursor protein and the region surrounding the mature terminus of the pea protein (AKFSRDS (SEQ ID NO: 37)). Because the proteins are so similar around this region it can be predicted that the mature terminus of the cassava SBE II protein is likely to be GKSSHES (SEQ ID NO: 38). The precursor has a predicted molecular mass of 96 kD and the mature protein a predicted molecular mass of 91.3 kD. The cassava SBE II has a short acidic tail at the C-terminus, although this is not as long or as acidic as that found in the pea or potato proteins. The significance of this acidic tail, if any, remains to be determined. One notable difference between the amino acid sequence of cassava SBE II and all other SBE II proteins is the presence of the sequence NSKH (SEQ ID NO: 32) at around position 697 instead of the conserved sequence DAD/EY (SEQ ID NO: 33). Although this conserved region forms part of a predicted α-helix (number 8) of the catalytic (β/α)₈ barrel domain (Burton et al 1995 cited previously), this difference does not abolish the SBE activity of the cassava protein as this gene can still complement the glycogen branching deletion mutant of E. coli. It may however affect the specificity of the protein. An interesting point is that the other cassava SBE II clone pSJ94 has the conserved sequence DADY (SEQ ID NO: 33).

One other point of interest concerning the sequence of the SBE II gene is the presence of an upstream ATG in the 5′ UTR. This ATG could initiate a small peptide of 42 amino acids which would terminate downstream of the predicted initiating methionine codon of the SBE II precursor. If this does occur then the translation of the SBE II protein from this mRNA is likely to be inefficient as ribosomes normally initiate at the 5′ most ATG in the mRNA. However the first ATG is in a poorer Kozak context than the SBE II initiator and it may be too close to the 5′ end of the message to initiate efficiently (14 nucleotides) thus allowing initiation to occur at the correct ATG.

In conclusion we have shown that cassava does have SBE II gene sequences, that they are expressed in both leaves and tubers and that more than one gene exists.

EXAMPLE 2

Cloning of a Second Full Length Cassava SBE II Gene

Methods

Oligonucleotides

CSBE219 CTTTATCTATTAAAGACTTC (Seq ID No. 20)
CSBE220 CAAAAAAGTTTGTGACATGG (Seq ID No. 21)
CSBE221 TCACTTTTTCCAATGCTAAT (Seq ID No. 22)
CSBE222 TCTCATGCAATGGAACCGAC (Seq ID No. 23)
CSBE223 CAGATGTCCTGACTCGGAAT (Seq ID No. 24)
CSBE224 ATTCCGAGTCAGGACATCTG (Seq ID No. 25)
CSBE225 CGCATTTCTCGCTATTGCTT (Seq ID No. 26)
CSBE226 CACAGGCCCAAGTGAAGAAT (Seq ID No. 27)
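Primer sequences such as those listed above are easy to sanity-check in a few lines. The sketch below, with hypothetical helper names `reverse_complement` and `gc_fraction`, computes the reverse complement and GC content of an oligonucleotide. Run on the listed sequences, it shows that CSBE224 is the exact reverse complement of CSBE223, i.e. the two primers anneal to opposite strands at the same site.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def gc_fraction(seq):
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

CSBE223 = "CAGATGTCCTGACTCGGAAT"  # Seq ID No. 24
CSBE224 = "ATTCCGAGTCAGGACATCTG"  # Seq ID No. 25

print(reverse_complement(CSBE223) == CSBE224)
print(f"GC content of CSBE223: {gc_fraction(CSBE223):.0%}")
```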

The 5′ end of the gene corresponding to the 3′ RACE clone pSJ94 was isolated in three rounds of 5′ RACE. Prior to performing the first round of 5′ RACE, 5 μg of total leaf RNA was reverse transcribed in a 20 μl reaction using conditions as described by the manufacturer (Superscript enzyme, BRL) and 10 pmol of the SBE II gene specific primer CSBE23. Primers were then removed and the cDNA tailed with dATP as described above. The first round of 5′ RACE used primers CSBE216 and Ro. This PCR reaction was diluted 1:20 and used as a template for a second round of amplification using primers CSBE217 and Ri. The gene specific primers were designed so that they would preferentially hybridise to the SBE II sequence in pSJ94. Amplified products appeared as a smear of approximately 600-1200 bp when subjected to electrophoresis on a 1% TAE agarose gel.

This smear was excised and DNA purified using a Qiaquick column (Qiagen) before ligation to the pT7Blue vector. Several clones were sequenced and clone #7 was designated pSJ125. New primers (CSBE219 and 220) were designed to hybridise to the 5′ end of pSJ125 and a second round of 5′ RACE was performed using the same CSBE23 primed library. Two fragments of 600 and 800 bp were cloned and sequenced (clones 13 and 17). Primers CSBE221 and 222 were designed to hybridise to the 5′ sequence of the longest clone (#13) and a third round of 5′ RACE was performed on a new library (5 μg total leaf RNA reverse transcribed with Superscript using CSBE220 as primer and then dATP tailed with TdT from Boehringer Mannheim). Fragments of approximately 500 bp were amplified, cloned and sequenced. Clone #13 was designated pSJ143. The process is illustrated schematically in FIG. 12.

To isolate a full length gene as a contiguous sequence, a new primer (CSBE225) was designed to hybridise to the 5′ end of clone pSJ143 and used with one of the primers (CSBE226 or 23) in the 3′ end of clone pSJ94, in a PCR reaction using RoRidT17 primed leaf cDNA as template. Use of primer CSBE226 resulted in production of Clone #2 (designated pSJ144), and use of primer CSBE23 resulted in production of Clones #10 and 13 (designated pSJ145 and pSJ146 respectively). Only pSJ146 was sequenced fully.

Results

Isolation of a Second Full Length Cassava SBE II Gene

A full length clone for a second SBE II gene was isolated by extending the sequence of pSJ94 in three rounds of 5′ RACE as illustrated schematically in FIG. 12. In each round of 5′ RACE, primers were designed that would preferentially hybridise to the new sequence rather than to the gene represented by pSJ116. In the final round of 5′ RACE, three clones were obtained that had the initiating methionine codon, and none of these had upstream ATGs. The overlapping cDNA fragments (sequences of the 5′ RACE clones pSJ143, 13, pSJ125 and the 3′ RACE clone pSJ94) could be assembled into a consensus sequence of approximately 3 kb which was designated csbe2-2.seq. This sequence contained one long ORF with a predicted size of 848 aa (Mr 97 kDa). The full length gene was then isolated as a contiguous sequence by PCR amplification from RoRidT17 primed leaf cDNA using primers at the 5′ (CSBE225) and 3′ (CSBE23 or CSBE226) ends of the RACE clones. One clone, designated pSJ146, was sequenced and the restriction map is shown along with the predicted amino acid sequence in FIG. 13.

Sequence Homologies Between SBE II Genes

The two cassava genes (pSJ116 and pSJ146) share 88.8% identity at the DNA level over the entire coding region (data not shown). The homology extends about 50 bases outside of this region but beyond this the untranslated regions show no similarity (data not shown). At the protein level the two genes show 86% identity over the entire ORF (data not shown). The two genes are more closely related to each other than to any other SBE II. Between species, the pea SBE I shows the most homology to the cassava SBE II genes.
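Percent identity figures such as those quoted above depend on the alignment and on how gaps are counted, neither of which is specified here. The sketch below shows one common convention, identical positions divided by the number of aligned columns in which neither sequence has a gap. The function name `percent_identity` and the toy aligned strings are hypothetical and are not cassava sequence.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b) if x != gap and y != gap]
    if not columns:
        raise ValueError("no ungapped columns to compare")
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)

# Toy pre-aligned sequences (assumption, for illustration only).
print(percent_identity("ACGT-ACGT", "ACGTTACCT"))
```

Counting gapped columns in the denominator instead would give a lower value for the same alignment, so identity figures from different sources are comparable only when the convention is known.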

EXAMPLE 3

Construction of Plant Transformation Vectors and Transformation of Cassava with Antisense Starch Branching Enzyme Genes

This example describes in detail how a portion of the SBE II gene isolated from cassava may be introduced into cassava plants to create transgenic plants with altered properties.

An 1100 bp Hind III—Sac I fragment of cassava SBE II (from plasmid pSJ94) was cloned into the Hind III—Sac I sites of the plant transformation vector pSJ64 (FIG. 11). This placed the SBE II gene in an antisense orientation between the 2×35S CaMV promoter and the nopaline synthase polyadenylation signal. pSJ64 is a derivative of the binary vector pGPTV-HYG (Becker et al., 1992 Plant Molecular Biology 20: 1195-1197) modified by inclusion of an approximately 750 bp fragment of pJIT60 (Guerineau et al 1992 Plant Mol. Biol. 18, 815-818) containing the duplicated cauliflower mosaic virus (CaMV) 35S promoter (Cabb-JI strain, equivalent to nucleotides 7040 to 7376 duplicated upstream of 7040 to 7433, as described by Frank et al., 1980 Cell 21, 285-294) to replace the GUS coding sequence. A similar construct was made with the cassava SBE II sequence from plasmid pSJ101.

These plasmids are then introduced into Agrobacterium tumefaciens LBA4404 by a direct DNA uptake method (An et al. Binary vectors, In: Plant Molecular Biology Manual (ed Gelvin and Schilperoort) AD 1988 pp 1-19) and can be used to transform cassava somatic embryos by selecting on hygromycin as described by Li et al. (1996, Nature Biotechnology 14, 736-740).

1. An isolated nucleic acid from cassava, or its complement, wherein the isolated nucleic acid encodes a polypeptide having starch branching enzyme Class A (SBEII) activity and the amino acid sequence of SEQ ID NO: 29.

2. The isolated nucleic acid according to claim 1, or its complement, wherein the isolated nucleic acid comprises the nucleic acid sequence of SEQ ID NO: 28.

3. The isolated nucleic acid according to claim 2, wherein the nucleic acid further comprises a 5′ and/or a 3′ untranslated region.

4. The isolated nucleic acid according to claim 1, or its complement, wherein the isolated nucleic acid comprises nucleotides 21-2531 of the nucleic acid sequence of SEQ ID NO: 28.

5. A construct comprising a nucleic acid from cassava, wherein said nucleic acid has at least 88% sequence identity to SEQ ID NO: 28 and wherein said nucleic acid encodes a protein with SBE II activity.

6. The construct of claim 5, further comprising a promoter operable in plants, wherein said promoter is operably linked to the nucleic acid.

7. The construct of claim 6, wherein the nucleic acid is in the sense or the anti-sense orientation.

8. A plant cell, plant tissue, or plant comprising the construct of claim 5 or 6.

9. A method of producing a transformed cassava plant, wherein the method comprises introducing into a cell of a cassava plant the construct of claim 5 or 6 to produce a transformed cassava cell and regenerating a transformed cassava plant from the transformed cassava cell.

10. A method of producing a transformed progeny cassava plant, wherein the method comprises introducing into a cell of a cassava plant a construct comprising a nucleic acid from cassava to produce a transformed cassava cell, wherein said nucleic acid has at least 88% sequence identity to SEQ ID NO: 28 and wherein said nucleic acid encodes a protein with SBE II activity; regenerating a transformed cassava plant from the transformed cassava cell; sexually crossing the transformed cassava plant with a second cassava plant, wherein the second cassava plant is not transformed with said nucleic acid; harvesting the resultant seed; growing the harvested seed; and selecting a transformed cassava progeny plant which comprises the nucleic acid.